ABSTRACT

The purpose of the conference, "Wagging the Dog, 
Carting the Horse: Testing vs. Improving California Schools," was to
discuss alternative perspectives on testing and evaluation in 
education and their role in improving teaching and learning. Four 
papers were presented: (1) "Using Educational Evaluation for the 
Improvement of California Schools," by Elliot Eisner; (2) "Evaluating 
Educational Quality: A Rational Design," by Eva L. Baker; (3) "The 
Influence of Testing on Teaching and Learning," by Norman 
Frederiksen; and (4) "Beyond Outcome Measures: An Agenda for School 
Improvement," by John Goodlad. These papers, and the small group
discussions at the conference, are summarized in this paper. The 
conference participants are listed, and the conference program is 
appended. (BW) 

This document summarizes proceedings of "Wagging the Dog, Carting the
Horse: Testing vs. Improving California Schools," a conference sponsored
jointly by the UCLA Center for the Study of Evaluation and the UCLA
Laboratory in School and Community Education. The one-day conference was
held in California.

The purpose of the conference was to discuss alternative perspectives
on testing and evaluation in education and their role in improving teaching
and learning. The conference considered whether, in the current rush to
[...] sight of what is meant by a "quality education." More particularly,
speakers posed - and provided alternative answers to - the questions of
what should be assessed and how evaluation can best contribute to our
understanding of schools and to their improvement.

The conference attracted a diverse audience of professional educators,
school board members, educational researchers and policy-makers. It
featured presentations by Professor Eva Baker, Director of the UCLA Center
for the Study of Evaluation; Professor Elliot Eisner, Stanford University;
Dr. Norman Frederiksen, Educational Testing Service; and Professor John
I. Goodlad, former Dean of the UCLA Graduate School of Education.
Presentations were followed by questions and answers and small group
discussions. (A copy of the program is provided in the appendix.) A
summary of the presentations and small group discussions is provided in the
following pages, followed by a listing of conference participants. The
full text from the presentations can be found in "1984 Policy Studies: The
use of testing and evaluation for assessing educational quality and
improving school practice."

Using Educational Evaluation for the Improvement
of California's Schools 
Elliot Eisner 

Measurement, evaluation, and testing should be viewed as three 
independent processes. Measurement is an arbitrary means of quantifying and
describing something without making value judgments about its quality. 
Evaluation involves making value judgments about something on the basis of 
some relevant criteria. Testing is a way of getting information by 
eliciting a response to something. 

However, while we can measure without evaluating, can evaluate without 
measuring and testing, and can test without measuring, the three are often 
confounded. For example, when newspaper headlines signal decline in test 
scores, these interpreted test scores, though they provide no information 
about school context and practice, have an effect on the setting of
educational priorities and the educational climate of schools. One has
only to spend time in a classroom to see the effect of tests and test 
scores on teachers' practice. 

What we need to do, instead of over-reliance on test scores, is to 
conduct evaluations that examine classrooms in the context of schooling, 
evaluations which would thus have the potential to improve the quality of 
schooling. What we need to do is to subject educational planning, 
curriculum development, and instructional content and strategies to 
evaluation, and in order to make effective uses of evaluation, we need to 
focus on the classroom unit. 

Unfortunately, we have created a situation that makes it difficult for 
evaluators to spend time in classrooms and equally difficult for teachers 
to get feedback on how well they are running their classes. We have
created a situation in which it is sometimes impossible for teachers to
discover ways to become better at their jobs. Over-reliance on tests as
"evaluative" information has contributed to this problem.

To help overcome this situation, we need to begin stressing 
evaluation's educational role, to use evaluation to inform teachers about 
significant educational practice. To be able to do this, we must provide 
teachers with access to each other and establish a climate of trust so
that teachers will be willing to accept the observations of their
colleagues. Such observation should be designed to describe the subtle but
significant events that take place in a school and to provide feedback to
teachers that they can use to modify their classroom practice. Of
necessity, this description must move beyond the purely quantitative
information provided in test scores.

A prime ingredient in the process outlined above is to reconceive our
notions of inservice. We need to view inservice as a means of stressing
professionalism, as a vehicle which offers teachers the critical support of
their colleagues, as a way of stimulating teachers to become connoisseurs
of educational practice. Such a climate will do much more to improve
education than current attempts to humiliate teachers into excellence by
publishing their students' test scores.

Instead of trying to bully schools into quality education, we need to
give teachers a stake in what they teach, we need to have diverse programs
which use multiple criteria, and we need to create a climate which fosters
and encourages teachers' professional growth.

Evaluating Educational Quality: A Rational Design

Eva L. Baker 

Evaluation reflects the viewpoint that we can influence the course of 
educational events by planning, implementing, and assessing. However, 
evaluation does not work this way very often. 

Though many models of evaluation have been proposed -- criterion-referenced,
norm-referenced, goal-oriented, responsive, and so forth -- the
one that we need -- effective -- has eluded us for a variety of reasons.

For evaluation to be used it must be usable, meaning that it should 
reach people who can act on it, it should reach them in a timely manner, 
and it should be valid and credible. Above all, to be used for school 
improvement, evaluation must be aimed at the principal unit of change — 
the school. 

Most evaluations, unfortunately, are driven by a different reality.
They are mandated from above, usually to meet the legitimate questions of 
school boards and government agencies about the effectiveness of 
education. These questions deal with educational processes such as quality 
of services as well as questions about what and how well students learn. 

But while evaluation needs to generate information that will 
contribute to responsible oversight of the educational system, it needs 
also to provide information useful at the point of change, the local 
school. And therein lies the dilemma: the mismatch between top-down, 
externally mandated evaluation requirements and bottom-up, locally
responsive efforts. 

A system for accountability and oversight is driven from the top down 
and demands comparability of assessment in areas requiring policy 
decisions. Point-of-change evaluation is driven from the bottom up and
emphasizes the uniqueness of each school and its staff, setting, students,
and social context.

Top-down evaluations usually rely upon commercial achievement tests to 
generate the information they need for comparison purposes. Bottom-up 
evaluations require more finely-grained information about student 
performance. While the two systems make overlapping demands, they also 
differ tremendously. Those within the schools often find little use in the 
information provided solely for top-down, policy needs. It is possible, 
however, to reconcile and merge the two viewpoints efficiently so that 
policy needs are met while maintaining the personality, integrity, and
idiosyncrasy of individual schools.

Such a system, embracing both top-down and bottom-up needs, would
allow for cross-student and school comparison. However, it would also 
allow for local option, quick turn-around outcomes measured across time, 
with possible multiple data sources. It would also address quality of 
school life, quality of effort, instructional resources, and include 
measures of process/outcome, affect, and overall context. It would provide 
a comprehensive evaluation system that could help direct a school 
improvement agenda. 

The Influence of Testing on Teaching and Learning

Norman Frederiksen

There is little question that tests influence what is taught and what
is learned. Students, for example, adopt different study methods for
different test formats. If a multiple choice test is expected, they will
try to learn factual material. If an essay test is expected, they will be
more inclined to look for broader concepts and their relationships.

This kind of test influence would not be bothersome if students were
exposed to a variety of test formats. But it seems that the number of
multiple choice tests given to students each year has grown enormously.
For example, because it is easier to write multiple choice items that
measure factual knowledge, item writers tend not to write items measuring
skills in analysis, problem solving, and application. Further, due to
increased pressure to teach minimum competency skills, there is less effort
to teach important skills that are difficult to measure with multiple
choice tests.

Certain trends seem to be emerging from these practices. There is
research showing that while performance on test items measuring the basic
skills has not declined, performance on items tapping the more complex
cognitive skills has. It seems clear that we need tests, then, which
measure not only the basic skills but also the ability to process
information rapidly and accurately, to apply principles in new situations,
and to solve problems not previously encountered.

There are various alternatives to multiple choice tests. The essay test is
one such possibility. And although essay tests are sometimes criticized
for their scoring time and low reliability, a variety of procedures exist
both for decreasing the time required for scoring and for increasing rater
reliability.

Other testing alternatives, many of which are quite different from
conventional tests, have grown out of theories of cognition. One such idea
is concerned with measuring speed in performing cognitive tasks. Further,
it is possible to combine both speed and power (the more conventional
approach) in a test.

Similarly, it is possible to devise tests which tap both short- and
long-term memory, and there are various approaches to assessing the
processes a student brings to bear in representing a given problem. 
Different scoring procedures exist for each of these alternatives. 

An important feature of the alternatives outlined above is that they 
represent tasks as well as constituting tests. Greater consideration needs 
to be given to task assignments such as writing papers, solving 
assignments, and taking tests. If we begin to view tests as tasks, we will 
be helping students to acquire not only the knowledge base but also the 
information-processing skills that are necessary to developing high levels 
of proficiency in thinking. 

Beyond Outcome Measures:
An Agenda for School Improvement 
John Goodlad 

The current furor over school reform needs to be placed in
perspective. The decline in competence in schooling and the increasing
disaffection in schooling that occurred in the 1970s are closely linked to
declining faith in our institutions in general and to economic downturns in
the same period.

Publication of A Nation at Risk led to a galvanic connection of 
achievement test scores with school health. That is, just as mediocrity in
the schools was seen as reflected in declining achievement scores, so was
improved school health to be seen in increased achievement scores.
But if the schools are in the poor condition that many suggest, it is going
to take a long period of care to bring them back to a condition of health.
Further, achievement test scores will continue to be a poor indicator for
judging that health.

How do we arrive at a more accurate diagnosis of the health of 
schooling, one that we can then use to prescribe a remedy? First, we need 
to view the healthy school as one which assures comprehensive democratic 
access to the domains of knowledge that constitute a good general 
education. Second, we need an overall evaluative system commensurate with
these expectations. That system, essentially, will place much greater
emphasis on the context of schooling, on the conditions of schooling which
promote or impede healthy growth. We need to examine, for example,
satisfaction, school climate, classroom climate, principal-teacher
relationships, school-community climate, and the like. Features such as
these are not tapped by achievement tests.

Having assessed critical features of the school's environment, we can
then bring to bear the value system of the professionals in the school to
set an agenda for improvement. And instructional assessment plays an
important role in monitoring progress in the agenda for improvement.

However, such assessment will bear little resemblance to the typical 
achievement test. Rather, it will be concerned with the provision and 
assessment of an array of learning experiences commensurate with our 
expectations of a healthy school. Such an assessment system would engage 
children in solving real problems. It might have children working on 
problems for which there is no reward. It would engage in modes of inquiry 
commensurate with what we think real learning is. It would provide them 
with good guidance and feedback. 

This kind of evaluation system will allow us to take the longitudinal
view required to gradually bring our schools back to health. 

Summary of Small Group Discussions

The conference provided participants with the opportunity to come
together in small groups to discuss the implications of the conference for
testing policies and school renewal. Small groups were constituted to
enable interaction between researchers, practitioners and policy makers.
The conclusions of the small group discussions were as follows:

Group 1:

1. There is a fundamental conflict between the ideal and the real.
We know that standardized test scores are not adequate or valid 
indicators of schools, but given the political realities, we must 
pursue them. 

2. We need to build better coalitions to influence the political
process. We need to organize researchers and practitioners so that 
they can influence program mandates and help enact programs which 
actually facilitate real change and renewal. 

3. We need to find better ways to communicate our progress to our 
school constituencies. We need to educate them about the limits of 
standardized tests and share with them a broader and more 
comprehensive picture of school progress. 

Group 2: 

1. We need to assume the challenge of reeducating our staffs, 
community, and districts to issues of renewal. We need to provide 
time for dialogue between and among units to set goals and plan 
solution strategies. 

2. We need to concentrate on developing professional leadership that 
can refocus and broaden educational reform. We need to build trust, 
support risk-taking and experimentation, provide incentives and 
rewards, and harness the time and resources needed for renewal. 

3. We need to develop alternative assessment measures to help us 
analyze and improve our progress. We need measures that are sensitive 
to what teachers are trying to accomplish, that can serve as formative 
checkpoints, and that tap higher level critical thinking and 
problem-solving skills. 

Group 3:

1. We need a strong, confident offensive for change. Rather than 
being defensive, we need to be proactive in setting the agenda for 
schools. 

2. We need to provide quality time to promote dialogue among
teachers, principals, districts and their communities to assess their 
needs and goals and to plan for improvement. We particularly need to 
bring teachers back into the dialogue. 

3. We need a strong, professional teaching staff in order to promote
renewal. We need to find the time, resources and incentives to
"reprofessionalize" them and to facilitate their continued growth and 
satisfaction. 

Group 4: 

1. We need to face the realities of current testing practices. We 
must do well to satisfy our public. 

2. We need to develop a variety of ways to assess school programs. 
We need measures of higher level thinking skills, attitudes, and other 
indicators of school climate and process. 

3. We need to reeducate the public, and the media in particular, as to
what we are accomplishing.

Appendix A

9:00 - 9:30 REGISTRATION/COFFEE

9:30 - 9:45 INTRODUCTION

Dr. Paul E. Heckman, Assistant Director, Laboratory in
School and Community Education, University of
California, Los Angeles

9:45 - 10:30 Dr. John I. Goodlad, Professor and Co-Director of the
Laboratory in School and Community Education,
University of California, Los Angeles

BEYOND OUTCOME MEASURES:
AN AGENDA FOR SCHOOL IMPROVEMENT

10:30 - 11:15 SMALL GROUP DISCUSSIONS

11:15 - 12:00 Dr. Eva Baker, Professor and Director of the Center for
the Study of Evaluation, University of California, Los
Angeles

ASSESSING LOCAL EDUCATIONAL QUALITY:
A COMPREHENSIVE SYSTEM

12:00 - 1:00 LUNCH

Dr. Norman Frederiksen, Educational Testing Service,
Princeton, New Jersey

INFLUENCES ON TESTING AND LEARNING

SMALL GROUP DISCUSSIONS

BREAK

Dr. Elliot Eisner, Professor, Stanford University

USING EDUCATIONAL EVALUATION FOR THE IMPROVEMENT OF
CALIFORNIA'S SCHOOLS

PANEL DISCUSSION

Dr. Eva Baker
Dr. Elliot Eisner
Dr. Norman Frederiksen
Dr. John I. Goodlad

RECEPTION



